We have developed a versatile Escherichia coli expression system based on the use of E. coli thioredoxin
(trxA) as a gene fusion partner. The broad utility of the system is illustrated by the production of a variety
of mammalian cytokines and growth factors as thioredoxin fusion proteins. Although many of these 
cytokines previously have been produced in E. coli as insoluble aggregates or "inclusion bodies", we show 
here that as thioredoxin fusions they can be made in soluble forms that are biologically active. In general 
we find that linkage to thioredoxin dramatically increases the solubility of heterologous proteins synthe- 
sized in the E. coli cytoplasm, and that thioredoxin fusion proteins usually accumulate to high levels. Two 
additional properties of E. coli thioredoxin, its ability to be specifically released from the E. coli cytoplasm 
by osmotic shock or freeze/thaw treatments and its intrinsic thermal stability, are retained by some fusions 
and provide convenient purification steps. We also find that the active-site loop of E. coli thioredoxin can be
used as a general site for small peptide insertions, allowing for the high-level production of soluble peptides
in the E. coli cytoplasm. 

The production of protein pharmaceuticals in Escherichia coli
is a cornerstone of the biotechnology industry.
However, a number of difficulties are frequently
encountered when expressing heterologous genes in 
this organism. For example, the significant differ- 
ences that exist between E. coli and human genes, 
both in their patterns of codon usage and in their 
translation initiation signals, can interfere with the efficient 
translation of human messenger RNA on bacterial ribosomes 1 .
Alternatively, heterologous proteins synthesized in E. coli may 
fail to accumulate to significant levels due to the activity of host 
cell proteases 2 . In addition, the physical characteristics of many 
therapeutically useful proteins can cause problems, since many 
in their native state are secreted molecules requiring glycosylation
and disulfide-crosslinking for both stability and solubility.
Such stabilizing influences are unavailable in the bacterial 
cytoplasm, with the result that heterologous proteins made 
within E. coli often appear as insoluble aggregates known as
"inclusion bodies" 3,4 .

Although production of a protein as an insoluble "inclusion 
body" can offer the advantage of an easy purification, devising 
an appropriate solubilization and refolding procedure is an 
empirical process, with no guarantee of success in all cases. As 
an alternative to refolding misfolded proteins produced in the 
E. coli cytoplasm, the secretion of proteins into the periplasm 
of E. coli has sometimes proven successful 5,6 . However, the
yields of protein obtained from these E. coli secretion systems
rarely approach those obtained by intracellular expression, and
in many instances problems with protein stability and solubility
in the periplasm have been reported 7,8 .

A popular strategy that avoids some of the problems associ- 
ated with other expression methods is to link the gene of interest 
to a second gene which is already known to be expressed well in 
E. coli, to generate a fusion protein. Most of the successful 
fusion protein systems position the protein of interest at the C- 
terminal end of the highly-expressed fusion partner, to ensure 
efficient translation initiation. However, although some of the
original fusion systems, such as those employing E. coli lacZ 9 or
trpE 10 as the highly-expressed partners, can successfully resolve
translation initiation difficulties, they do not always solve the
intrinsic solubility problems of heterologous proteins made in 
E. coli. The fusion proteins produced by these systems are also 
frequently found in inclusion bodies. There are other systems, 
such as those employing Staphylococcus protein A 11 , Schistosoma
glutathione-S-transferase 12 , and the E. coli maltose-binding
protein, malE 13 , as fusion partners, that are much more
successful in producing soluble fusion proteins. Additionally in 
each of these cases the fusion partner provides a distinct bio- 
chemical property, for example amylose binding in the maltose- 
binding protein system, which can be exploited as an affinity tag 
for fusion protein purification. 

Here we describe a new fusion gene expression system based 
on the use of E. coli thioredoxin (trxA) 14 as the fusion partner.
We have found that E. coli thioredoxin bears a number of 
characteristics which make it a particularly suitable choice in 
this role. When over-expressed from plasmid vectors, E. coli 
thioredoxin can accumulate to 40% of the total cellular protein 15 , 
and even at these expression levels all of the protein remains in 
the soluble fraction. Since thioredoxin is small (11,675 Da) it
usually represents a relatively modest portion of any fusion 
protein, in contrast to other systems where the fusion partner 
itself may comprise most of the fusion's total mass. The tertiary 
structure reveals that both the N- and C- termini of thioredoxin 
are accessible on the molecule's surface 16 , in good positions for
potential fusions to other proteins. We show here that a wide 
variety of secreted mammalian cytokines and growth factors can 
be successfully produced in a soluble form in the E. coli cyto- 
plasm as C-terminal fusions to thioredoxin. The tertiary struc- 
ture also shows that the characteristic thioredoxin active-site, 
-CGPC-, protrudes from the body of the protein as a surface 
loop. We find that this thioredoxin active-site loop can be used as 
a site for internal peptide fusions. 
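
As a rough illustration of the size argument made above, the following Python sketch computes the share of a fusion protein's mass contributed by the fusion partner. The thioredoxin mass (taken as 11,675 Da) comes from the text, and the 21 kD value is the size of the mature IL-11 product reported in the Results. The figure of about 116 kD for a beta-galactosidase (lacZ) partner is an assumption introduced here purely for comparison, and the short linker peptide is ignored.

```python
def partner_mass_fraction(partner_da: float, passenger_da: float) -> float:
    """Fraction of the fusion's total mass contributed by the fusion partner."""
    if partner_da <= 0 or passenger_da <= 0:
        raise ValueError("masses must be positive")
    return partner_da / (partner_da + passenger_da)


TRX_DA = 11_675    # E. coli thioredoxin, value quoted in the text
IL11_DA = 21_000   # approximate size of the mature human IL-11 product (Results)
LACZ_DA = 116_000  # ASSUMPTION: approximate beta-galactosidase monomer mass

trx_share = partner_mass_fraction(TRX_DA, IL11_DA)
lacz_share = partner_mass_fraction(LACZ_DA, IL11_DA)

print(f"thioredoxin share of fusion mass: {trx_share:.0%}")
print(f"lacZ share of fusion mass:        {lacz_share:.0%}")
```

Under these assumptions thioredoxin accounts for roughly a third of the fusion's mass, whereas a large partner would account for most of it.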

Thioredoxin possesses two further characteristics which can 
be exploited for fusion protein purifications. The first is the 
inherent thermal stability of the molecule, a property that is 
retained by some thioredoxin fusions and which enables heat 
treatments to be used as an effective purification tool. The 
second additional property relates to thioredoxin's cellular location.
Although E. coli thioredoxin is a cytoplasmic protein, it
has been shown to occupy a special position within the cell,
being mainly located on the cytoplasmic face of the adhesion
zones that exist between the inner and outer membranes
of the E. coli cell envelope 17 . From this location thioredoxin
has been shown to be quantitatively released to the exterior of the
cell by simple osmotic shock or freeze/thaw treatments 18 , a
remarkable property which we show here is retained by some
thioredoxin fusion proteins, providing a simple and convenient
purification step.

FIGURE 1. A diagram illustrating the main features of the
thioredoxin gene fusion expression vector, pTRXFUS. The
DNA sequence is shown for the 3'-end of the thioredoxin gene,
the "linker" region encoding an enterokinase (enteropeptidase)
cleavage site, and a "polylinker" sequence containing
convenient restriction endonuclease cloning sites. The
sequence surrounding the active-site loop of thioredoxin is
also shown to illustrate the single RsrII site which can be used
for peptide sequence insertions at this location. Abbreviations
used are: trxA, the E. coli thioredoxin gene; BLA, β-lactamase
gene; ori, colE1 replication origin; pL, bacteriophage λ major
leftward promoter; aspA terminator, E. coli aspartate aminotransferase
transcription terminator.

FIGURE 2. The proteins found in the soluble fractions derived
from E. coli cells expressing eleven different thioredoxin gene
fusions. The numbers in parentheses refer to the growth temperature
chosen for expressing each particular fusion. Lane 1:
host E. coli strain GI724 (negative control, 37°C), lane 2: murine
IL-2 (15°C), lane 3: human IL-3 (15°C), lane 4: murine IL-4
(15°C), lane 5: murine IL-5 (15°C), lane 6: human IL-6 (25°C),
lane 7: human MIP-1α (37°C), lane 8: human IL-11 (37°C), lane
9: human M-CSF (37°C), lane 10: murine LIF (25°C), lane 11:
murine SF (37°C), and lane 12: human BMP-2 (25°C). Shown is
a 10% SDS-PAGE gel, stained with Coomassie blue.

Results 

Production of fusion proteins. The E. coli expression vec- 
tor pTRXFUS (Fig. 1) was used as the basis for joining genes 
encoding a variety of mammalian growth factors and cytokines 
to the 3'-end of E. coli trxA. The coding sequence for each
cytokine was inserted into pTRXFUS at the unique KpnI site in
the 3'-polylinker to create an in-frame translational fusion. Gene
expression under the transcriptional control of the bacteriophage 
lambda pL promoter was performed in strain GI724, with the 
growth temperature for expression in each case chosen so as to 
maximize soluble accumulation of the particular fusion protein, 
ranging from 15°C to 37°C. 

To assess the expression and solubility of each of the fusion 
proteins, cultures were harvested approximately four hours post- 
induction, the cell pellets were lysed in a French pressure cell 
and the resulting lysates separated into soluble and insoluble 
fractions by centrifugation at 15,000xg (see Experimental Proto- 
col). The soluble portion was loaded onto a 10% SDS-polyacryl- 
amide gel. Figure 2 shows that each of the fusion proteins was 
expressed well, to approximately 5-20% of the total cell protein. 
Importantly, all of the fusion proteins were present mainly or
entirely in the soluble cellular fraction under the growth conditions
chosen. In contrast, previous attempts at intracellular
expression of many of these proteins in E. coli have been 
reported by other groups, and have invariably led to the forma- 
tion of insoluble "inclusion bodies", for example with: IL-2 
(ref. 19), IL-3 (ref. 20), IL-4 (ref. 21), IL-6 (ref. 22), SF (ref. 
23), and M-CSF (ref. 24). In fact, we have observed that IL-3, 
IL-6, and BMP-2 are produced in inclusion bodies at 15 °C when 
expressed using a comparable vector and strain but without the 
aid of fusion to thioredoxin (data not shown). 

Many of the clarified lysates containing soluble fusion pro- 
teins were tested for in vitro biological activity (Table 1). All of 
the fusions which contained heterologous partner proteins usu- 
ally found as monomers in their natural state exhibited some 
biological activity. A number of the fusions tested, including the 
IL-3, IL-6 and IL-11 fusions, were fully active. Others, such
as the IL-2, IL-4, IL-5, LIF and SF fusions, displayed significant 
activity, sometimes diminished when compared to the native 
factors. Partial biological activity would not be unexpected for a 
subset of thioredoxin-cytokine fusion proteins, since for some 
the receptor binding site may be obscured by the thioredoxin 
domain. Thioredoxin fusions to covalently-linked dimeric heter- 
ologous proteins, for example BMP-2 and M-CSF, were inactive 
in bioassays, perhaps due to their inability to form dimers when 
fused to thioredoxin. However, MIP-1α, while inactive as a
thioredoxin fusion, attained full bioactivity when cleaved from 
its partner (data not shown). 

The high levels of solubility and biological activity observed 
in most cases suggest that heterologous proteins can, in general,
assume their proper native conformations while fused to
thioredoxin and produced in the E. coli cytoplasm. 

Thermal stability of thioredoxin fusions. A remarkable 
characteristic of E. coli thioredoxin is its ability to withstand 
prolonged incubation at elevated temperatures without undergo- 
ing irreversible thermal denaturation. Indeed, purification pro- 
cedures for isolating E. coli thioredoxin from cell lysates have 
utilized 85°C incubations for precipitating most of the other 
proteins, while retaining thioredoxin in the soluble fraction 14 .
We find that this property of thermal stability is shared by some 
thioredoxin fusion proteins, providing in those cases a simple
and rapid purification step. Figure 3 shows the time course for
an 80°C heat-treatment of a cell lysate from a strain engineered
to express a thioredoxin/MIP-1α fusion protein. The fusion
remained fully soluble throughout the 10 minute incubation
period at 80°C. Most contaminating E. coli proteins denatured
and precipitated after just one minute, resulting in a 5-fold
purification of the fusion protein in the recovered soluble fraction.
Since this kind of purification step is dependent on the
thermal stability of each particular heterologous partner protein,
it will not be applicable to all thioredoxin fusions. However, it is
a useful option of the thioredoxin system that is unavailable with
other fusion systems. Additionally, we have observed that peptide
insertions into the active-site loop of thioredoxin (see below)
also often retain the thermal stability of native thioredoxin,
affording easy purifications for these molecules.

Selective release of thioredoxin fusions. E. coli thioredoxin 
has been shown to preferentially reside at sites around the inner 
periphery of the cytoplasmic membrane in E. coli known as 
adhesion zones, or Bayer's patches 17 . At these sites there are
gaps in the peptidoglycan cell wall where the inner and outer cell 
membranes are fused together, leading to an adjacent "osmoti- 
cally sensitive" cellular compartment whose contents can be 
released to the surrounding medium by a rapid osmotic shock
treatment in the presence of EDTA. It has been observed,
perhaps surprisingly, that two particular E. coli cytoplasmic 
proteins, EF-Tu and thioredoxin, can be selectively released by 
such treatment. Indeed thioredoxin has been reported to be 
quantitatively expelled from the cytoplasm by osmotic-shock 
treatments, while most other cytoplasmic proteins remain 
encapsulated by the cytoplasmic membrane 18 .

To test whether thioredoxin fusion proteins would also dis- 
play selective localization and release, we subjected cells 
expressing different thioredoxin/cytokine fusion constructs to 
osmotic-shock treatments (see Experimental Protocol). Figure 
4 shows the results obtained with overexpressed E. coli thioredoxin,
the MIP-1α fusion and the IL-11 fusion. As reported
previously 18 , thioredoxin itself was quantitatively released by
osmotic shock, even when produced at >20% of the total cell
protein (Fig. 4, lane 2). Cells expressing thioredoxin/MIP-1α
were found to release about 30% of the fusion protein upon 
osmotic shock (Fig. 4, lane 5). Up to 70% of the thioredoxin/ 
IL-11 fusion could be osmotically released from the cytoplasm, 
resulting in a significant purification (Fig. 4, lane 8) for a non- 

FIGURE 3. Protein remaining in solution following heat-treatment
of a thioredoxin fusion protein at 80 °C. E. coli cells producing
the thioredoxin/MIP-1α fusion protein were lysed in a
French pressure cell and the lysate heated at 80 °C. Aliquots
were removed at the specified times, cooled on ice, clarified by 
centrifugation, and the soluble fractions loaded onto a 10% 
SDS-PAGE gel. Proteins were visualized with Coomassie blue. 

TABLE 1. In vitro biological activities, in crude bacterial
lysates, for various thioredoxin/cytokine fusion proteins.

Thioredoxin     Specific Activity   Percent Activity of   Assay Name      Reference
Fusion          (units/mg) a        Native Cytokine b

murine IL-2     1.1x10*             n/d                   32D clone 23    [59]
human IL-3      5.5x100             100                   M-07e           [57]
murine IL-4     1.7x10^5            n/d                   32D clone 23    [59]
murine IL-5     1.8x10^5            n/d                   TF-1            [61]
human IL-6      4.8x10"             100                   T1165           [27]
human MIP-1α    not active c        0                     CFU-A           [60]
human IL-11     2.5x10*             100                   T1165           [27]
human M-CSF     not active          0                     bone marrow     [62]
murine LIF      2.5x10"             2.5                   DA-1a           [63]
murine SF       5.0x10*             n/d                   M-07e           [57]
human BMP-2     not active          0                     W-20            [58]

a Specific activity is expressed as dilution units per milligram of cytokine.
Activities were measured in crude bacterial lysates; the protein concentrations
of fusions present in these lysates were estimated from stained
SDS-polyacrylamide gels.
b In a number of cases the activity of a native cytokine
in a particular assay was not known. In these instances comparisons with
the activities of thioredoxin fusions could not be made (signified by n/d).
c Although the thioredoxin-MIP-1α fusion protein was itself inactive,
fusion-derived MIP-1α was fully active in the CFU-A assay.



FIGURE 4. Selective osmotic release of thioredoxin fusion proteins.
E. coli cells producing native thioredoxin (lanes 1-3), the
thioredoxin/MIP-1α fusion (lanes 4-6), and the thioredoxin/IL-11
fusion (lanes 7-9) were subjected to an osmotic release
procedure (see Experimental Protocol). Samples representing
equivalent amounts of whole cells (lanes 1,4,7), material
released from whole cells by osmotic shock (lanes 2,5,8), and
residual proteins not released from cells by osmotic shock
(lanes 3,6,9), were run on a 10% tricine SDS-polyacrylamide
gel and were subsequently visualized with Coomassie blue.
The mobilities on this gel of thioredoxin, the thioredoxin/MIP-1α
and thioredoxin/IL-11 fusion proteins, and EF-Tu are
shown by arrows.



chromatographic procedure. It is interesting to notice in Figure 
4 that approximately half of the EF-Tu present in these cells was 
also released by the osmotic shock treatments. 

As an alternative means of selective release, equivalent 
results were obtained when cells were subjected to a simple 
freeze/thaw treatment in the presence of EDTA, instead of the 

FIGURE 5. Specific cleavage of the thioredoxin/IL-11 fusion
protein by enteropeptidase. Shown is a Coomassie blue- 
stained, 10% tricine SDS-polyacrylamide gel of: partially-puri- 
fied thioredoxin/IL-11 fusion protein (lane 1), the same protein 
sample following cleavage with enteropeptidase (lane 2), and 
IL-11 subsequently purified from the reaction products (lane 3). 

FIGURE 6. Specific cleavage of TRX-MPHANT by enteropeptidase.
Partially purified TRX-MPHANT, a thioredoxin fusion protein
bearing a 25 residue insertion in the active site loop, was
incubated in a reaction with (lanes 4-6) or without enteropepti- 
dase (lanes 1-3). Reaction products were run on a 10% tricine 
SDS-polyacrylamide gel, which was subsequently stained with 
Coomassie blue. Shown are timepoints taken at the start of the 
incubation (lanes 1 and 4), after 5h (lanes 2 and 5), and after 
22h (lanes 3 and 6). 



osmotic shock procedure (data not shown). This had previously 
been reported for native thioredoxin 18 and was found to work 
well for some fusions. The yields of released material following 
the selective release procedures varied depending upon the par- 
ticular C-terminal fusion partner protein, and on the growth 
stage of the cells (J.M.M. and E.A.D., unpublished data). 
Release was at a maximum during exponential growth, and 
decreased as cells approached stationary phase. 

Cleavage of thioredoxin fusion proteins. The ability to 
efficiently cleave and separate thioredoxin from fused polypep- 
tides is essential for this expression system to be a practical 
method for producing therapeutically useful proteins in E. coli. 
With this in mind we designed the linker peptide lying between
the thioredoxin and fused C-terminal domains to include the
sequence -DDDDK-, which is the recognition sequence for the
mammalian intestinal protease enteropeptidase. Enteropeptidase
cleaves specifically following the P1 lysine residue, and is
tolerant to a wide variety of amino-acid residues in the P1'
position (E.R.L. and L.A.C., unpublished observations). A
thioredoxin/human IL-11 fusion protein was produced and purified
from E. coli cell lysates, as described in the Experimental
Protocol, and incubated in a reaction with bovine enteropeptidase.
Figure 5 shows that the thioredoxin/IL-11 fusion in this
reaction was cleaved specifically at the enteropeptidase cleavage 
site in the linker peptide, resulting in a 21 kD product corres- 
ponding to mature human IL-11, and a 13 kD product consisting 
of thioredoxin with the 10 amino acid linker peptide still attached 
to its C-terminus (lane 2). Amino-terminal sequencing of the 
IL-11 product confirmed that cleavage was specific at the 
expected site, leaving a homogeneous N-terminus (data not 
shown). Under the reaction conditions most of the fusion protein 
was cleaved in 15 hours. The cleavage could be driven to com- 
pletion, however, by longer incubation or by the addition of 
more enzyme. Separation of the cleaved IL-11 product from both 
the thioredoxin/linker portion and the remaining uncleaved
fusion protein was easily achieved by exploiting the differences 
in charge between the three components. Mature human IL-11 is 
very basic, with a pI (predicted from its amino acid sequence) of
11.7. E. coli thioredoxin is acidic, with a pI of about 4.5 14 , with
the four aspartic acid residues in the linker peptide increasing 
this predominance of negative charge. As a result, IL-11 was 
efficiently separated from the other components of the cleavage 
reaction by chromatography on QAE-Toyopearl. At pH 8.0 in a 
low ionic strength buffer (25 mM HEPES), cleaved IL-11 did not 
bind anion-exchange resins and was recovered at 100% yield 
(Fig. 5, lane 3). Uncleaved fusion and the thioredoxin/linker
moiety bound quantitatively and were effectively removed. 
Bovine enterokinase also bound to anion exchange resins and 
was quantitatively removed from the IL-11 product. In addition, 
a substantial reduction in endotoxin levels was achieved in this 
step (data not shown). The resulting IL-11 exhibited full bioactiv- 
ity in the T1165 assay 27 . 
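
The charge-based separation described above can be sketched with a simple rule of thumb: a protein carries a net negative charge, and so tends to bind an anion exchanger, when the buffer pH is above its pI. The Python sketch below applies that rule at the pH 8.0 used in the text. It is a deliberate simplification that ignores ionic strength and local charge distribution. The thioredoxin pI of about 4.5 is from the text; the value of 11.0 for IL-11 is an assumption standing in for the "very basic" predicted pI of roughly 11.

```python
def net_charge_sign(pI: float, pH: float) -> int:
    """Return +1 if the protein is net positive at this pH, -1 if net negative, 0 at its pI."""
    if pH < pI:
        return 1
    if pH > pI:
        return -1
    return 0


def binds_anion_exchanger(pI: float, pH: float) -> bool:
    """Rule of thumb: only net-negative proteins bind a positively charged (anion-exchange) resin."""
    return net_charge_sign(pI, pH) < 0


BUFFER_PH = 8.0  # chromatography pH given in the text

components = {
    "cleaved IL-11": 11.0,       # ASSUMPTION: stands in for the very basic predicted pI
    "thioredoxin/linker": 4.5,   # thioredoxin pI from the text; the linker aspartates lower it further
}

for name, pI in components.items():
    fate = "binds resin" if binds_anion_exchanger(pI, BUFFER_PH) else "flows through"
    print(f"{name}: {fate}")
```

This reproduces the qualitative outcome reported above: the basic IL-11 product passes through the column while the acidic thioredoxin/linker moiety is retained.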

Expression of peptides in the thioredoxin active-site loop. 
Often it may be desirable to produce short peptide sequences in 
E. coli, for instance to use as immunogens or for screening 
purposes 28 . However, such peptides frequently have inherent
solubility problems, or they may be sensitive to degradation by 
host cell proteases. As a way to avoid these problems we tested 
thioredoxin's suitability as a general fusion partner for peptide
expression in E. coli. 

We suspected that peptides fused at the N- or C-termini of 
thioredoxin might be susceptible to E. coli amino- and carboxy- 
peptidases. That suspicion was confirmed by the low expression 
level and proteolytic lability we observed when we attempted to 
produce a thioredoxin fusion protein with a 20-residue C-termi- 
nal peptide extension (data not shown). To avoid this potential 
problem, we chose instead to use an internal location within 
thioredoxin as a peptide fusion site. The tertiary structure of
thioredoxin shows that the active-site of the molecule, -C32-G-P-C35-,
is a surface loop that protrudes from the body of the
protein, and thus presumably contributes little to overall structural
stability. A convenient RsrII restriction site lies in the DNA
sequence encoding the loop, cutting between G33 and P34. This
RsrII site was used to introduce a segment of DNA encoding a 
25 residue peptide sequence, -PGSGRPLAVKVFSYIDDDDK 
GPGSG-, into the thioredoxin gene. The resulting fusion protein 
(called TRX-MPHANT) was found to accumulate to high levels 
in E. coli, and was as stable to 80°C heat treatments as
wild-type thioredoxin (data not shown), indicating that a peptide
insertion into the active site loop did not adversely affect the
folding or final stability of the thioredoxin/peptide fusion mole- 
cule. We have inserted a wide variety of other peptide sequences 
of between 14 and 25 residues in length at this location and have 
found that the active-site loop is very permissive, almost invari- 
ably allowing for the high-level stable expression of most pep- 
tides as fusions. The great majority of these fusion proteins 
are soluble. 

By design the peptide insert in TRX-MPHANT contains an 
enteropeptidase recognition sequence, -DDDDK-. To deter- 
mine whether or not this insert was displayed on the surface of 
thioredoxin, partially purified TRX-MPHANT was incubated in 
the presence of enteropeptidase (Fig. 6, lanes 4-6). A specific 
cleavage was observed which was not seen when enteropepti- 
dase was either omitted from the reaction (Fig. 6, lanes 1-3), or 
when native thioredoxin was used in place of the TRX- 
MPHANT fusion (data not shown). The specific cleavage by 
enteropeptidase of TRX-MPHANT indicates that the inserted 
peptide in this fusion is in an accessible location in thioredoxin. 
In further support of this notion we have raised monoclonal 
antibodies, using TRX-MPHANT as the immunogen, which 
specifically bind to TRX-MPHANT itself but which do not bind 
to native thioredoxin. This further demonstrates that peptides 
inserted into the active-site loop of thioredoxin are accessible, 
and that such fusions are potentially useful immunogens (data 
not shown). 
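
The cleavage logic used in this experiment can be illustrated with a short Python sketch that locates the -DDDDK- recognition sequence in a one-letter protein sequence and splits the chain after the lysine. The insert sequence is the 25-residue peptide given in the text; the function and variable names are hypothetical, and the sketch treats every DDDDK motif as cleavable, without modelling the accessibility or sequence-context effects that determine cleavage in practice.

```python
import re

# Enteropeptidase recognition sequence; cleavage occurs after the lysine.
ENTEROPEPTIDASE_SITE = re.compile(r"DDDDK")


def cleave_after_site(seq: str) -> list[str]:
    """Split a one-letter protein sequence after every DDDDK motif."""
    fragments = []
    start = 0
    for match in ENTEROPEPTIDASE_SITE.finditer(seq):
        fragments.append(seq[start:match.end()])
        start = match.end()
    fragments.append(seq[start:])
    # Drop an empty trailing fragment if the sequence ends with the motif.
    return [f for f in fragments if f]


# 25-residue peptide inserted into the thioredoxin active-site loop (from the text).
INSERT = "PGSGRPLAVKVFSYIDDDDKGPGSG"

fragments = cleave_after_site(INSERT)
print(len(INSERT))   # 25
print(fragments)     # ['PGSGRPLAVKVFSYIDDDDK', 'GPGSG']
```

Applied to a full fusion sequence, the same function would return the fragments expected on either side of the inserted site.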

Discussion 

In contrast to many other gene fusion systems, the thiore- 
doxin gene fusion system provides a solution for a major prob- 
lem which has bedeviled heterologous gene expression in 
E. coli: the formation of inclusion bodies. We have demonstrated
the utility of the system by producing a total of eleven
different mammalian cytokines and growth factors as soluble, C- 
terminal, thioredoxin fusions in the E. coli cytoplasm. For these 
proteins the thioredoxin fusion system has obviated the time- 
consuming process of in vitro protein refolding, since most have 
been produced previously in E. coli only in an insoluble form. 
Not only are these proteins soluble as thioredoxin fusions, but 
many also exhibit high levels of bioactivity— indicating that they 
have adopted their correct conformations. In addition, there are 
three other advantages that the thioredoxin fusion system can 
provide: high overall gene expression levels, a simple initial 
purification step through a selective release of fusion protein 
from the E. coli cytoplasm by osmotic-shock, and the ability to 
use heat treatments as an easy purification tool. 

We note that the use of thioredoxin as a fusion partner has 
several naturally occurring counterparts. For example, the 
sequences of the mammalian protein disulfide isomerases (PDI) 
can each be thought of as comprising two complete "thioredoxin 
domains" flanking a central "non-thioredoxin domain" of 200
amino acids 29 . An identical arrangement of terminal full length 
"thioredoxin domains" has been identified in the sequence of rat 
phosphoinositide-specific phospholipase C (PLC-I) 30 , although
PDI and PLC-I share no other structural or functional homolo- 
gies in the central "non-thioredoxin" region. A third instance of 
a naturally occurring thioredoxin fusion protein is a unique 
arrangement of three complete "thioredoxin domains", two at 
the N-terminus and one at the C-terminus, flanking a central 200 
residue long "non-thioredoxin" region in mammalian ERp72 31 .

The vast majority of thioredoxin fusion proteins that we 
have produced are well expressed and produce protein products 
that are both stable in the E. coli cytoplasm and exhibit the 
expected biological activity of the fused heterologous protein. 
A subset of these fusion proteins are selectively released by 
osmotic shock, and some are resistant to thermal denaturation.
These properties all suggest that both thioredoxin and the fused
heterologous protein are able to fold correctly when linked 
together. Since thioredoxin has a very tight tertiary fold, with 
over 80% of the polypeptide chain involved in elements of 
strong secondary structure, its folding pathway is probably very 
resistant to any perturbations that may be caused by the presence 
of fused heterologous proteins. This may in fact be the under- 
lying reason that explains why thioredoxin is such a good 
fusion partner. Thioredoxin's robust folding characteristics are 
further illustrated by our finding that fusions carrying peptide 
insertions in the active-site loop not only fold normally, but often
retain the inherent thermal stability of the wild-type protein (e.g. 
TRX-MPHANT). 

Why should heterologous proteins fold correctly in the 
E. coli cytoplasm as fusions to thioredoxin when by themselves 
they would accumulate in inclusion bodies? It has been suggested
that inclusion bodies arise by the inappropriate aggregation
of partially-folded or incorrectly-folded intermediates 3 . It is
possible that by physically linking a heterologous protein to a 
stable and highly soluble fusion partner such as thioredoxin 
these aggregates might be prevented from forming, allowing 
correct folding to occur eventually. The fact that thioredoxin is 
translated first may allow it to fold before its nascent C-terminal 
fusion partner, in this way being able to passively "interact" 
with the partner as it emerges from the ribosome. The nature of
this "interaction" is unclear; one possibility is that thioredoxin
would act as a covalently-linked "molecular chaperon", fulfilling
a solubilizing role similar in some ways to authentic chaperon
proteins. In contrast to the trans-acting chaperonins,
however, co-translation would dictate a close physical contact 
between the thioredoxin and heterologous partner protein 
"domains", facilitating any potential interactions. Indeed we 
have found that enhanced production of IL-11 in E. coli required
a physical linkage to thioredoxin, and could not be accomplished 
by merely increasing intracellular concentrations of thioredoxin 
in trans (data not shown). It is important to note that the surface 
of thioredoxin has evolved to allow physical contact with a 
number of different proteins in its role as an intracellular oxido- 
reductase 14 . This property may also help to explain its ability to 
confer solubility on partner proteins. 

Disulfide crosslinks can be important stabilizing structures 
for proteins 15 , yet the reducing environment of the E. coli cyto- 
plasm makes stable disulfide formation very difficult, leading to 
another possible mechanism for inclusion body formation in 
E. coli: thermal denaturation and precipitation of heterologous 
proteins at physiological temperatures due to an inability to form 
disulfides. Perhaps surprisingly we find that many mammalian 
proteins that are disulfide crosslinked in their natural state are 
expressed as soluble thioredoxin fusions at physiological tem- 
peratures in E. coli. Maybe thioredoxin, by acting as a "chap- 
eron", can keep partially denatured proteins lacking their 
normal disulfides from forming into insoluble aggregates. We 
have noticed that some thioredoxin fusions become more solu- 
ble as the growth temperature of cells expressing them is low- 
ered. This has also been reported for a number of heterologous 
proteins expressed alone in the E. coli cytoplasm 34 , and may be 
due to a reduction in thermal denaturation. Alternatively the 
lower growth temperature may diminish the strength of some 
inappropriate hydrophobic interactions that may otherwise lead 
to aggregate formation. Nevertheless we find that for any partic- 
ular growth temperature the thioredoxin fusion is invariably 
much more soluble than the heterologous protein expressed 
by itself. 

We have also not yet eliminated the possibility that the over- 
expression of thioredoxin or thioredoxin fusions can change 
the redox environment of the E. coli cytoplasm, or that over-expressed
thioredoxin or thioredoxin fusions acting as protein
disulfide oxidoreductases 14 can act directly on heterologous
proteins. 

The synthesis of short peptide sequences in E. coli has 
proven difficult in the past: they are usually either rapidly 
degraded by host peptidases or, less frequently, accumulate in an 
insoluble form. We have shown that thioredoxin contains a 
permissive site that allows for the insertion of a wide variety of 
peptide sequences. Peptides can now be made, soluble and in 
large amounts, as internal fusions into the thioredoxin active-site 
loop. Moreover in that location peptides occupy an accessible 
position on the protein's surface. One potential use for this is as a 
convenient way of generating immunogens. However a more 
exciting possibility is the use of thioredoxin as a vehicle for 
producing libraries of random peptide sequences intracellularly 
in E. coli or indeed in other organisms, since the inherent 
stability of E. coli thioredoxin should not be limited to the 
bacterial cytoplasm. 

Osmotic release of heterologous proteins from the cytoplasm 
of E. coli is a rare attribute which has been described in just a 
few instances 35,36. Thioredoxin fusions now enable this highly 
desirable property to be extended to other heterologous proteins. 
Osmotic release is a rapid, inexpensive and effective technique 
that not only removes the majority of protein contaminants, but 
also removes most of the nucleic acids that are normally present 
in bacterial lysates. The mechanism of the process is unclear, 
but the quantitative release of E. coli thioredoxin by osmotic 
shock 18 suggests efficient targeting to an osmotically-sensitive 
cytoplasmic compartment, probably by a specific thioredoxin 
sequence or some element of tertiary structure. Our finding that 
a number of thioredoxin fusion proteins can also be released in 
good yield from the cell by osmotic shock suggests that this 
targeting element remains effective. However we also find that 
not all thioredoxin fusions can be efficiently released, perhaps in 
these cases because the targeting element is obscured by the 
heterologous protein "domain". 

Cleavage of the fusion protein is clearly an essential part of 
any fusion expression system. We chose to use a protease with 
high specificity and broad utility, enteropeptidase 26 , as our initial 
cleavage reagent and have shown that it can be highly effective 
in that role. However the cleavage site in the linker region of 
our thioredoxin fusion vector can be easily exchanged for other 
sites as necessary, and many of the different agents, both enzy- 
matic and chemical, that have been used previously to cleave 
fusion proteins 37 are applicable to the thioredoxin system. With 
this added flexibility, we expect that the thioredoxin gene fusion 
expression system will prove useful both for small-scale 
research applications and for the large-scale production of 
biopharmaceuticals. 

Experimental Protocol 

Bacterial strains and plasmids. All expression work was performed 
in E. coli K12 strain GI724 (ATCC 55151), a derivative of RB791 (ref. 38) 
(= W3110 lacIq lacPL8), which contains the bacteriophage λ repressor (cI) 
gene stably integrated into the chromosomal ampC locus. The cI gene is 
under the transcriptional control of a synthetic Salmonella typhimurium 
trp promoter, integrated upstream of cI in ampC. The thioredoxin fusion 
expression vector, pTRXFUS (Fig. 1), was constructed using standard 
techniques 39. The plasmid is based on pUC-18 (ref. 40) and contains a 
ColE1 origin of replication, a β-lactamase gene as a selectable marker, 
and the bacteriophage λ pL promoter 41 located upstream of the E. coli 
trxA gene. A short DNA sequence encoding a linker peptide, 
"-GSGSGDDDDK", is positioned at the 3'-end of trxA, serving not only 
to fuse heterologous proteins with thioredoxin but also to provide a spe- 
cific enteropeptidase site 26 to allow for subsequent cleavage. Further 
downstream in the vector lies a polylinker DNA sequence containing 
convenient restriction sites and also a transcription terminator sequence 
based on that found following the E. coli aspA gene 42. A variety of 
mammalian cytokine and growth factor cDNA sequences were introduced 
into the pTRXFUS vector, including those encoding mature forms of 
human IL-3 (ref. 43), IL-6 (ref. 44), MIP-1α (ref. 45), IL-11 (ref. 27), 
BMP-2 (ref. 46) and M-CSF (residues 1-223) (ref. 47), and also the 
murine cytokines IL-2 (ref. 48), IL-4 (ref. 49), IL-5 (ref. 50), LIF (ref. 51) 
and steel factor (SF, residues 1-164) (ref. 52). Wild-type E. coli thiore- 
doxin was expressed using a vector, pALtrxA-781, which contained the 
normal trxA translation terminator in place of the sequence encoding the 
3'-linker peptide found in pTRXFUS. The DNA sequence encoding the 
active-site loop of E. coli trxA contains a site for the restriction enzyme 
RsrII (Fig. 1). This site is unique in pALtrxA-781, facilitating the intro- 
duction of synthetic DNA sequences at this location in thioredoxin. The 
plasmid pALtrxA-MPHANT codes for a thioredoxin fusion containing 
the 25 residue peptide sequence "-PGSGRPLAVKVFSYIDDDDKGP 
GSG-" inserted into the active-site loop. 
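
As a quick consistency check on the two peptide sequences quoted above, the following Python sketch counts the residues in the active-site loop insert and locates the enteropeptidase recognition motif (Asp-Asp-Asp-Asp-Lys, with cleavage after the Lys) in each sequence. The function and variable names are illustrative only and are not part of the original work.

```python
# Sequences as quoted in the text (one-letter amino acid code).
LINKER = "GSGSGDDDDK"
LOOP_INSERT = "PGSGRPLAVKVFSYIDDDDKGPGSG"

# Enteropeptidase recognizes Asp-Asp-Asp-Asp-Lys and cleaves after the Lys.
ENTEROPEPTIDASE_SITE = "DDDDK"


def cleavage_positions(sequence, site=ENTEROPEPTIDASE_SITE):
    """Return the 1-based positions of the Lys after which cleavage is expected."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        # 0-based start of the motif + motif length = 1-based index of its Lys.
        positions.append(start + len(site))
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in (("linker", LINKER), ("loop insert", LOOP_INSERT)):
        print(name, len(seq), "residues; cleavage after", cleavage_positions(seq))
```

Under these assumptions the linker carries its recognition site at its C-terminal end, so cleavage there would release the partner protein without linker residues attached, while the loop insert carries a site internally.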

Expression of fusion proteins. Strain GI724 containing the thiore- 
doxin fusion expression plasmid of interest was grown at the desired 
temperature in IMC broth (M9 medium 53 supplemented with 0.5% w/v 
glucose, 0.2% w/v casamino acids and 100 µg/ml ampicillin) until the 
culture reached an A550 of 0.5. In a manner similar to that described by 
Mieschendahl et al. 54, the amount of transcription initiated from the pL 
promoter on pTRXFUS is determined by the level of cI in GI724, which 
itself is controlled by cytoplasmic tryptophan levels. In the presence of 
high levels of tryptophan, cI synthesis is repressed in the host strain, 
transcription from the pL promoter is induced, and plasmid-directed gene 
expression proceeds. Fusion protein synthesis, induced by the addition of 
100 µg/ml tryptophan to the culture medium, was allowed to continue for the 
desired time, typically 4 hours. Lysates were prepared by suspending the 
cells in 1 mM phenylmethylsulfonyl fluoride (PMSF)/1 mM p-aminoben- 
zamidine (PABA)/20 mM Tris-Cl, pH 8, and then passing them through a 
French pressure cell at 20,000 p.s.i. Proteins in the "soluble" fraction of 
these lysates were defined as those that remained in the supernatant 
following centrifugation at 15,000 × g for 15 minutes. 
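
The induction logic described in this paragraph is a double-negative switch: tryptophan shuts off the trp promoter driving the repressor gene, and the absence of repressor frees the pL promoter. A minimal Python sketch of that logic follows. It treats each step as a simple on/off state, which is a deliberate simplification (real repression is graded and time-dependent), and the function names are hypothetical.

```python
def ci_repressor_made(tryptophan_high):
    """The repressor gene sits behind a trp promoter, which is shut off by high tryptophan."""
    return not tryptophan_high


def pl_promoter_active(tryptophan_high):
    """pL is blocked by the repressor, so it is active only when no repressor is made."""
    return not ci_repressor_made(tryptophan_high)


if __name__ == "__main__":
    for trp in (False, True):
        state = "on" if pl_promoter_active(trp) else "off"
        print(f"tryptophan high = {trp}: fusion gene expression {state}")
```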

Heat treatments. E. coli cells expressing the trxA/MIP-1α fusion 
were resuspended in 2.5 mM EDTA/20 mM Tris-Cl, pH 8, to a concentra- 
tion of 100 A550 units/ml before lysis in a French pressure cell at 20,000 
p.s.i. The unclarified lysate was then heated at 80 °C, with aliquots 
removed at various times. Each aliquot was cooled quickly on ice before 
clarification by centrifugation at 15,000 × g for 10 minutes. Soluble pro- 
teins present in the supernatant fraction were analyzed on a 10% SDS- 
polyacrylamide gel. 

Osmotic shock fractionations. E. coli cells expressing native thiore- 
doxin, the trxA/MIP-1α fusion, or the trxA/IL-11 fusion were resuspended 
in ice-cold 20% sucrose/2.5 mM EDTA/20 mM Tris-Cl, pH 8, to a concen- 
tration of 5 A550 units/ml and kept on ice for 10 minutes. The cells were 
then pelleted by brief centrifugation at 15,000 × g for 30 seconds before 
being resuspended gently in an equivalent volume of ice-cold 2.5 mM 
EDTA/20 mM Tris-Cl, pH 8. After a second 10 minute incubation on ice, 
the cells were again pelleted by centrifugation at 15,000xg, this time for *0 
minutes. Samples of the original cells, the final supernatant ("shockate"), 
and pellet fractions were analyzed on a 10% SDS-polyacrylamide gel 56 . 
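
Cell concentrations in the heat treatment and osmotic shock protocols are given in A550 units/ml. The Python sketch below shows one way to work out a resuspension volume. It assumes the common convention that one A550 unit is the quantity of cells in 1 ml of culture at an A550 of 1.0, and that absorbance scales linearly with cell density; the text does not state either point. The culture volume and density in the example are hypothetical.

```python
def resuspension_volume_ml(culture_volume_ml, culture_a550, target_units_per_ml):
    """Buffer volume that brings harvested cells to the target A550 units/ml.

    Assumes 1 A550 unit = cells from 1 ml of culture at A550 = 1.0.
    """
    if target_units_per_ml <= 0:
        raise ValueError("target density must be positive")
    total_units = culture_volume_ml * culture_a550
    return total_units / target_units_per_ml


if __name__ == "__main__":
    # Hypothetical harvest: 100 ml of culture at A550 = 2.0.
    print(resuspension_volume_ml(100, 2.0, 5), "ml for osmotic shock")
    print(resuspension_volume_ml(100, 2.0, 100), "ml for heat treatment")
```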

Purifications. All purification steps were performed at 4°C. Cells 
expressing the trxA/IL-11 fusion were resuspended in 25 mM Hepes, 
pH 8.0/5 mM EDTA, at a concentration of 0.2 g wet weight cells/ml and 
lysed with three passages through a French pressure cell at 20,000 p.s.i. 
The lysate was clarified by centrifugation at 20,000 × g for 30 minutes, and 
the insoluble fraction discarded. The lysate supernatant was diluted to a 
protein concentration of 5 mg/ml with lysis buffer and applied to a QAE- 
Toyopearl 550C anion-exchange column (Toso-Haas). The column was 
washed extensively with lysis buffer, and the bound fraction containing the 
fusion protein was then eluted with lysis buffer containing 100 mM NaCl. 
This eluate was adjusted to 2 M NaCl and applied to a phenyl-Toyopearl 
650S column (Toso-Haas). The bound fusion protein was eluted with lysis 
buffer containing 0.5 M NaCl. 
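
The two concentration adjustments in this procedure (dilution of the lysate supernatant to 5 mg/ml, and raising the NaCl concentration of the first column eluate to 2 M) are simple mass-balance calculations. The Python sketch below illustrates them. The text does not say how the salt adjustment was made, so adding a concentrated NaCl stock is an assumption, as are the 5 M stock and the sample volumes used in the example.

```python
def buffer_to_add_ml(volume_ml, conc_mg_per_ml, target_mg_per_ml=5.0):
    """Buffer volume needed to dilute a protein sample to the target (C1*V1 = C2*V2)."""
    if conc_mg_per_ml < target_mg_per_ml:
        raise ValueError("sample is already more dilute than the target")
    final_volume_ml = volume_ml * conc_mg_per_ml / target_mg_per_ml
    return final_volume_ml - volume_ml


def salt_stock_to_add_ml(volume_ml, current_molar, target_molar, stock_molar):
    """Volume of concentrated salt stock that raises a sample to target_molar.

    Solves (c0*v0 + cs*vs) / (v0 + vs) = ct for vs.
    """
    if not current_molar <= target_molar < stock_molar:
        raise ValueError("need current <= target < stock concentration")
    return volume_ml * (target_molar - current_molar) / (stock_molar - target_molar)


if __name__ == "__main__":
    # Hypothetical sample: 20 ml of supernatant at 12.5 mg/ml protein.
    print(buffer_to_add_ml(20, 12.5), "ml lysis buffer to reach 5 mg/ml")
    # Hypothetical eluate: 10 ml at 0.1 M NaCl, adjusted with an assumed 5 M stock.
    print(round(salt_stock_to_add_ml(10, 0.1, 2.0, 5.0), 2), "ml of 5 M NaCl")
```

Adding a liquid stock dilutes the protein as well as raising the salt; dissolving solid NaCl would avoid that but is equally an assumption here.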

Enteropeptidase cleavages. The trxA/IL-11 fusion protein purified as 
described above was dialyzed against lysis buffer to reduce ionic strength 
prior to enteropeptidase cleavage. The fusion protein was then combined 
with bovine enteropeptidase (Biozyme, EK-3 grade, specific activity 
= 1.8 × 10^5 units/mg) in 25 mM Hepes, pH 8.0/5 mM EDTA, at 
an enzyme:substrate ratio of 1:2000 (w/w) and incubated at 37°C for 
15 hours. The EDTA was added to the reaction to eliminate minor second- 
ary proteolysis of the IL-11 product. The reaction was terminated by 
addition of p-aminobenzamidine to 5 mM, and the products were analyzed 
on a 10% tricine gel. 
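
The enzyme dose implied by a 1:2000 (w/w) enzyme:substrate ratio can be worked out as in the Python sketch below. It assumes the specific activity reads 1.8 × 10^5 units/mg (the exponent is partly illegible in the source), and the 10 mg of fusion protein in the example is a hypothetical amount chosen for illustration.

```python
def enteropeptidase_dose(substrate_mg, substrate_per_enzyme=2000.0,
                         specific_activity_units_per_mg=1.8e5):
    """Return (enzyme mass in mg, enzyme activity in units) for a w/w ratio.

    The default specific activity is an assumed reading of the source value.
    """
    if substrate_mg < 0:
        raise ValueError("substrate mass cannot be negative")
    enzyme_mg = substrate_mg / substrate_per_enzyme
    units = enzyme_mg * specific_activity_units_per_mg
    return enzyme_mg, units


if __name__ == "__main__":
    # Hypothetical digest of 10 mg of fusion protein.
    enzyme_mg, units = enteropeptidase_dose(10.0)
    print(f"{enzyme_mg * 1000:.1f} ug enzyme, about {units:.0f} units")
```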